Quantization and Pruning of Multilayer Perceptrons: towards Compact Neural Networks
Authors
Abstract
Preface

A connectionist system, or neural network, is a massively parallel network of weighted interconnections that connect one or more layers of non-linear processing elements (neurons). To fully profit from the inherent parallel processing of these networks, the development of parallel hardware implementations is essential. However, these hardware implementations often differ in various ways from the ideal mathematical description of a neural network model. For example, both electronic and optical implementations of neural networks require quantized network parameters, either because device operation is quantized or because a coarse quantization of network parameters is beneficial for designing compact networks. Most of the standard algorithms for training neural networks are not suitable for quantized networks, because they are based on gradient descent and require a high accuracy of the network parameters. Several weight discretization techniques have been developed to reduce the required accuracy further without deterioration of network performance. One of the earliest of these techniques [Fiesler-88] is further investigated and improved in this report.

Another way to obtain compact networks is to minimize their topology for the problem at hand. However, it is impossible to know a priori the size of such a minimal network topology. An unsuitable topology will increase the training time, lower the generalization performance on unseen test data [Gosh-94], and in some cases even cause non-convergence. One method that lowers the importance of choosing the initial network topology while minimizing the network size is pruning, that is, the removal of connections or neurons during training.

A combination of parameter/weight quantization and network pruning, leading to networks that have a small topology and for which a low parameter accuracy is sufficient, is of particular importance for hardware implementations of neural networks. Such networks offer a minimization of chip area and computational requirements. Due to their lack of redundancy, they are also expected to show better generalization on unseen patterns (Occam's razor). Such a combination of pruning techniques with weight quantization is studied in the second part of this report.

Five different quantization functions (chapter 3) and six pruning methods (chapter 4) are evaluated in a series of experiments on real-world benchmark problems. The main goal is first to improve the original weight discretization technique as much as possible, to obtain networks with both a small number of discrete weight levels and good generalization performance on unseen test data. Secondly, the results from the first …
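To make the two techniques concrete, here is a minimal Python sketch of quantizing a layer's weights to a few discrete levels and pruning its smallest connections. It is not the specific discretization method of [Fiesler-88]; the function names, the uniform level spacing, and the magnitude-based pruning criterion are illustrative assumptions.

```python
import numpy as np

def quantize_weights(w, n_levels=7, w_max=None):
    """Map weights onto n_levels equidistant values in [-w_max, +w_max]."""
    if w_max is None:
        w_max = np.max(np.abs(w))
    step = 2.0 * w_max / (n_levels - 1)   # spacing between adjacent levels
    return np.clip(np.round(w / step) * step, -w_max, w_max)

def prune_by_magnitude(w, fraction=0.5):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    threshold = np.quantile(np.abs(w), fraction)
    mask = np.abs(w) >= threshold
    return w * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(scale=0.5, size=(8, 4))        # weights of one MLP layer
w_pruned, mask = prune_by_magnitude(w, 0.5)   # remove half of the connections
w_discrete = quantize_weights(w_pruned)       # at most 7 distinct values remain
print(np.unique(w_discrete).size)
```

With an odd number of levels, zero is itself a level, so pruned connections remain representable after quantization, which is one reason the two techniques combine naturally.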
Similar resources
Worst case analysis of weight inaccuracy effects in multilayer perceptrons
We derive here a new method for the analysis of weight quantization effects in multilayer perceptrons, based on the application of interval arithmetic. Unlike previous results, we find worst-case bounds on the errors due to weight quantization that are valid for every distribution of the input or weight values. Given a trained network, our method makes it easy to compute the minimum n...
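As an illustration of the interval-arithmetic idea, the sketch below bounds the pre-activations of a single linear layer when every weight may deviate by up to eps from its nominal value. This is only a one-layer toy under assumed shapes and eps; the method described in the abstract covers complete multilayer networks.

```python
import numpy as np

def worst_case_bounds(x, w, eps):
    """Interval-arithmetic bounds on a layer's pre-activations when every
    weight w[j, i] is only known up to +/- eps (e.g. after quantization)."""
    y = x @ w                        # nominal pre-activations
    slack = eps * np.sum(np.abs(x))  # worst-case deviation per output
    return y - slack, y + slack

# Example: with quantization step q, eps = q / 2; where the two bounds
# disagree in sign, a hard-limiting neuron's output is not guaranteed.
rng = np.random.default_rng(1)
x = rng.normal(size=16)
w = rng.normal(size=(16, 4))
lo, hi = worst_case_bounds(x, w, eps=0.05)
print(np.signbit(lo) != np.signbit(hi))   # True where the sign can flip
```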
Hourly Wind Speed Prediction using ARMA Model and Artificial Neural Networks
In this paper, a comparison study is presented of artificial intelligence and time series models for 1-hour-ahead wind speed forecasting. Three typical types of neural networks, namely the adaptive linear element, multilayer perceptrons, and radial basis function networks, and the ARMA time series model are investigated. The wind speed data used are the hourly mean wind speeds collected at the Binalood site in I...
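As a sketch of the time-series side of such a comparison, the snippet below fits an ARMA model and produces a 1-hour-ahead forecast with statsmodels. The synthetic series and the model order (2, 0, 1) are stand-ins, not the data or the order identified in the paper.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic stand-in for an hourly mean wind speed series (m/s).
rng = np.random.default_rng(2)
speeds = 8 + np.cumsum(rng.normal(scale=0.3, size=500)).clip(-5, 5)

# ARMA(p, q) is ARIMA with d = 0; the order here is illustrative.
model = ARIMA(speeds, order=(2, 0, 1))
result = model.fit()
print(result.forecast(steps=1))   # 1-hour-ahead prediction
```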
Classification, Association and Pattern Completion using Neural Similarity Based Methods
A framework for Similarity-Based Methods (SBMs) includes many classification models as special cases: neural networks of the Radial Basis Function Network type, Feature Space Mapping neurofuzzy networks based on separable transfer functions, Learning Vector Quantization, variants of the k-nearest-neighbor method, and several new models that may be presented in network form. Multilayer Percept...
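A minimal sketch of the common scheme (the function name and the Gaussian similarity kernel are assumptions) shows how nearest-neighbor voting becomes one point in such a framework: a class's score is its summed similarity to the query.

```python
import numpy as np

def sbm_classify(x, X_train, y_train, k=5, sigma=1.0):
    """Similarity-based classification: score each class by the summed
    Gaussian similarity of x to its k nearest training samples."""
    X_train, y_train = np.asarray(X_train), np.asarray(y_train)
    d = np.linalg.norm(X_train - x, axis=1)   # distances to all prototypes
    nearest = np.argsort(d)[:k]               # indices of the k neighbours
    sims = np.exp(-d[nearest] ** 2 / (2 * sigma ** 2))
    classes = np.unique(y_train)
    scores = [sims[y_train[nearest] == c].sum() for c in classes]
    return classes[np.argmax(scores)]

# Toy usage: two Gaussian blobs.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(sbm_classify(np.array([2.5, 2.5]), X, y))
```

Replacing the Gaussian similarities with hard votes recovers plain k-NN; other choices of similarity function and aggregation yield the RBF- and LVQ-like variants the framework covers.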
Supervised Models C1.2 Multilayer perceptrons
This section introduces multilayer perceptrons, which are the most commonly used type of neural network. The popular backpropagation training algorithm is studied in detail. The momentum and adaptive step size techniques, which are used for accelerated training, are discussed. Other acceleration techniques are briefly referenced. Several implementation issues are then examined. The issue of gen...
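The momentum technique mentioned above amounts to a one-line change to the backpropagation weight update; a minimal sketch follows, with illustrative learning-rate and momentum values.

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.1, mu=0.9):
    """Backpropagation update with a momentum term:
    delta_w(t) = mu * delta_w(t-1) - lr * dE/dw."""
    velocity = mu * velocity - lr * grad
    return w + velocity, velocity

# Usage inside a training loop: keep one velocity array per weight matrix.
w = np.zeros((4, 3))
v = np.zeros_like(w)
for grad in (np.ones((4, 3)),) * 3:   # placeholder gradients
    w, v = momentum_step(w, grad, v)
```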
Steganalysis of embedding in difference of image pixel pairs by neural network
In this paper a steganalysis method is proposed for the pixel value differencing (PVD) method. This steganographic method, which has been immune to conventional attacks, performs the embedding in the difference of the values of pixel pairs. Therefore, the histogram of the differences of an embedded image differs from that of a cover image. A number of characteristics are identified in the di...
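The statistic in question is straightforward to compute; the sketch below assumes a grey-scale image as a NumPy array and non-overlapping horizontal pairs, while the exact pairing and the features derived from the histogram in the paper are not reproduced here.

```python
import numpy as np

def pair_difference_histogram(img):
    """Histogram of grey-level differences of non-overlapping horizontal
    pixel pairs, the statistic that PVD embedding alters."""
    img = img.astype(int)
    width = img.shape[1] - img.shape[1] % 2       # trim to an even width
    diffs = img[:, 1:width:2] - img[:, 0:width:2]
    return np.bincount(diffs.ravel() + 255, minlength=511)  # bins -255..255

rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(64, 64))
hist = pair_difference_histogram(img)
print(hist[255 - 3:255 + 4])   # counts of differences -3..+3
```

A steganalyser would compare such histograms, for instance their shape around small differences, between suspect images and typical cover images.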